Smart Paper Notes

Faster Approximate Dynamic Programming by Freezing Slow States

Yijia Wang , Daniel R. Jiang

Categories: Artificial Intelligence | Machine Learning

2023-01-03

We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transition more "slowly." Such structure is common in real-world problems where sequential decisions need to be made at high frequencies, yet information that varies at a slower timescale also influences the optimal policy. Examples include: (1) service allocation for a multi-class queue with (slowly varying) stochastic costs, (2) a restless multi-armed bandit with an environmental state, and (3) energy demand response, where both day-ahead and real-time prices play a role in the firm's revenue. Models that fully capture these problems often result in MDPs with large state spaces and large effective time horizons (due to frequent decisions), rendering them computationally intractable. We propose an approximate dynamic programming algorithmic framework based on the idea of "freezing" the slow states, solving a set of simpler finite-horizon MDPs (the lower-level MDPs), and applying value iteration (VI) to an auxiliary MDP that transitions on a slower timescale (the upper-level MDP). We also extend the technique to a function approximation setting, where a feature-based linear architecture is used. On the theoretical side, we analyze the regret incurred by each variant of our frozen-state approach. Finally, we give empirical evidence that the frozen-state approach generates effective policies using just a fraction of the computational cost, while illustrating that simply omitting slow states from the decision modeling is often not a viable heuristic.
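The abstract applies value iteration (VI) to the upper-level MDP. As a reminder of what that building block does, here is a minimal sketch of plain VI on a toy two-state MDP. It is not the authors' frozen-state algorithm; the transition table `P`, the reward table `R`, and the discount factor are all illustrative assumptions.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Plain value iteration on a finite MDP.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the expected one-step reward (both hypothetical inputs).
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_iter):
        # Bellman optimality backup for every state.
        V_new = [
            max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        # Stop once the largest change falls below the tolerance.
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V


# Toy MDP (assumed for illustration): state 1 is absorbing with reward 2.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1.0, 1)], [(1.0, 1)]],  # state 1: both actions stay
]
R = [
    [0.0, 1.0],
    [2.0, 2.0],
]
V = value_iteration(P, R, gamma=0.5)
```

With a discount of 0.5, the fixed point can be checked by hand: V(1) = 2 / (1 - 0.5) = 4 and V(0) = 1 + 0.5 * 4 = 3.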


On Noisy Evaluation in Federated Hyperparameter Tuning

Kevin Kuo , Pratiksha Thaker , Mikhail Khodak , John Nguyen , Daniel Jiang , Ameet Talwalkar , Virginia Smith

Categories: Machine Learning

2022-12-17

Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning. We first identify and rigorously explore key sources of noise, including client subsampling, data and systems heterogeneity, and data privacy. Surprisingly, our results indicate that even small amounts of noise can significantly impact tuning methods, reducing the performance of state-of-the-art approaches to that of naive baselines. To address noisy evaluation in such scenarios, we propose a simple and effective approach that leverages public proxy data to boost the evaluation signal. Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning.
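Client subsampling is one of the noise sources named above. The following minimal simulation illustrates how it works, under stated assumptions: the per-client scores are synthetic, and the helper names `subsampled_eval` and `eval_spread` are hypothetical and not taken from the paper. It compares how much an evaluation estimate fluctuates when few versus many clients are sampled.

```python
import random
import statistics


def subsampled_eval(client_scores, k, rng):
    """Estimate the global score from k clients drawn without replacement."""
    return statistics.mean(rng.sample(client_scores, k))


def eval_spread(client_scores, k, trials=200, seed=0):
    """Standard deviation of the subsampled estimate over repeated draws."""
    rng = random.Random(seed)
    estimates = [subsampled_eval(client_scores, k, rng) for _ in range(trials)]
    return statistics.stdev(estimates)


# Synthetic, heterogeneous per-client accuracies (assumed for illustration).
_rng = random.Random(42)
client_scores = [min(1.0, max(0.0, _rng.gauss(0.7, 0.15))) for _ in range(500)]

spread_small = eval_spread(client_scores, k=5)    # few clients per evaluation
spread_large = eval_spread(client_scores, k=100)  # many clients per evaluation
```

In this toy setting the estimate from 5 clients fluctuates much more than the one from 100 clients, which is the kind of evaluation noise that can mislead a tuning method comparing nearby hyperparameter configurations.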


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
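Patch-based training, the most common workaround reported for oversized samples, amounts to cutting each image into smaller tiles. Below is a minimal sketch of that tiling step on a 2D image stored as nested lists. The function name `extract_patches` and the choice to drop partial border tiles are assumptions for illustration and do not come from the survey.

```python
def extract_patches(image, patch, stride):
    """Cut a 2D image (list of rows) into square patches.

    Patches that would extend past the border are skipped, so the
    right and bottom edges are dropped when the sizes do not divide evenly.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            patches.append(
                [row[left:left + patch] for row in image[top:top + patch]]
            )
    return patches


# A 4x4 toy image whose pixel values equal their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = extract_patches(image, patch=2, stride=2)
```

With a stride equal to the patch size the tiles do not overlap; a smaller stride yields overlapping patches, which trades extra computation for more training samples.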


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.


Learning Hierarchical Metrical Structure Beyond Measures

Junyan Jiang , Daniel Chin , Yixiao Zhang , Gus Xia

Categories: Machine Learning

2022-09-21

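Music contains hierarchical structures beyond beats and measures. While hierarchical structure annotations are helpful for music information retrieval and computer musicology, such annotations are scarce in current digital music databases. In this paper, we explore a data-driven approach to automatically extract hierarchical metrical structures from scores. We propose a new model with a Temporal Convolutional Network-Conditional Random Field (TCN-CRF) architecture. Given a symbolic music score, our model takes in an arbitrary number of voices in a beat-quantized form and predicts a 4-level hierarchical metrical structure from the downbeat level to the section level. We also annotate a dataset using RWC-POP MIDI files to facilitate training and evaluation. We show by experiments that the proposed method performs better than the rule-based approach under different orchestration settings. We also perform some simple musicological analysis on the model predictions. All demos, datasets, and pre-trained models are publicly available on GitHub.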


Deep learning at the edge enables real-time streaming ptychographic imaging

Anakha V Babu , Tao Zhou , Saugat Kandel , Tekin Bicer , Zhengchun Liu , William Judge , Daniel J. Ching , Yi Jiang , Sinisa Veseli , Steven Henke

Categories: Machine Learning

2022-09-20

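Coherent microscopy techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent X-ray microscopy methods such as ptychography are poised to revolutionize nanoscale materials characterization. However, the associated significant increases in data and compute needs mean that conventional approaches no longer suffice for recovering sample images in real time from high-speed coherent imaging experiments. Here, we demonstrate a workflow that leverages artificial intelligence at the edge and high-performance computing to enable real-time inversion of X-ray ptychography data streamed directly from a detector. The proposed AI-enabled workflow eliminates the sampling constraints imposed by traditional ptychography, allowing low-dose imaging using orders of magnitude less data than required by traditional methods.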


MANDO: Multi-Level Heterogeneous Graph Embeddings for Fine-Grained Detection of Smart Contract Vulnerabilities

Hoang H. Nguyen , Nhat-Minh Nguyen , Chunyao Xie , Zahra Ahmadi , Daniel Kudendo , Thanh-Nam Doan , Lingxiao Jiang

Categories: Machine Learning

2022-08-28

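Learning heterogeneous graphs consisting of different types of nodes and edges enhances the results of homogeneous graph techniques. An interesting example of such graphs is control-flow graphs representing possible software code execution flows. As such graphs represent more semantic information of code, developing techniques and tools for them can be highly beneficial for detecting vulnerabilities in software and improving its reliability. However, existing heterogeneous graph techniques are still insufficient for handling complex graphs in which the number of different types of nodes and edges is large and variable. This paper concentrates on Ethereum smart contracts as a sample of software code represented by heterogeneous contract graphs built upon both control-flow graphs and call graphs containing different types of nodes and links. We propose MANDO, a new heterogeneous graph representation to learn the structures of such heterogeneous contract graphs. MANDO extracts customized metapaths, which compose relational connections between different types of nodes and their neighbors. Moreover, it develops a multi-metapath heterogeneous graph attention network to learn multi-level embeddings of different types of nodes and their metapaths in the heterogeneous contract graphs, which can capture the code semantics of smart contracts more accurately and facilitate both fine-grained line-level and coarse-grained contract-level vulnerability detection. Our extensive evaluation on large smart contract datasets shows that MANDO improves the vulnerability detection results of other techniques at the coarse-grained contract level. More importantly, it is the first learning-based approach capable of identifying vulnerabilities at the fine-grained line level, and it significantly improves on traditional code-analysis-based vulnerability detection approaches by 11.35% to 70.81% in terms of F1-score.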
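A metapath is a sequence of node types, and its instances are the concrete paths in a typed graph that follow that sequence. The sketch below enumerates such instances on a tiny directed graph. It is only an illustration of the general idea: the node-type labels (`FUNCTION`, `STATEMENT`), the toy graph, and the helper `metapath_instances` are hypothetical and are not MANDO's actual schema or implementation.

```python
def metapath_instances(node_type, edges, metapath):
    """Enumerate node sequences whose types follow `metapath`.

    node_type maps node id -> type label, edges is a list of directed
    (source, target) pairs, and metapath is a sequence of type labels.
    """
    # Build a directed adjacency list.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    # Start from every node of the first type, then extend one type at a time.
    paths = [[n] for n, t in node_type.items() if t == metapath[0]]
    for wanted in metapath[1:]:
        paths = [
            path + [nxt]
            for path in paths
            for nxt in adjacency.get(path[-1], [])
            if node_type[nxt] == wanted
        ]
    return paths


# Toy typed graph (assumed for illustration).
node_type = {
    "f1": "FUNCTION",
    "s1": "STATEMENT",
    "s2": "STATEMENT",
    "f2": "FUNCTION",
}
edges = [("f1", "s1"), ("s1", "s2"), ("s2", "f2"), ("f1", "f2")]

instances = metapath_instances(
    node_type, edges, ("FUNCTION", "STATEMENT", "STATEMENT")
)
```

Instances like these can then serve as the neighborhoods over which a metapath-based attention model aggregates node features.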


Deep Learning Enabled Time-Lapse 3D Cell Analysis

Jiaxiang Jiang , Amil Khan , S. Shailja , Samuel A. Belteton , Michael Goebel , Daniel B. Szymanski , B. S. Manjunath

Categories: Computer Vision

2022-08-17

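This paper presents a method for time-lapse 3D cell analysis. Specifically, we consider the problem of accurately localizing and quantitatively analyzing sub-cellular features, and of tracking individual cells from time-lapse 3D confocal cell image stacks. The heterogeneity of cells and the volume of multi-dimensional images present a major challenge for fully automated analysis of cell morphogenesis and development. This paper is motivated by the pavement cell growth process and by the goal of building a quantitative morphogenesis model. We propose a deep-feature-based segmentation method to accurately detect and label each cell region. An adjacency-graph-based method is used to extract sub-cellular features of the segmented cells. Finally, a robust graph-based tracking algorithm using multiple cell features is proposed for associating cells at different time instances. Extensive experimental results are provided and demonstrate the robustness of the proposed method. The code is available on GitHub, and the method is available as a service through the BisQue portal.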


OpenSRH: optimizing brain tumor surgery using intraoperative stimulated Raman histology

Cheng Jiang , Asadur Chowdury , Xinhai Hou , Akhil Kondepudi , Christian W. Freudiger , Kyle Conway , Sandra Camelo-Piragua , Daniel A. Orringer , Honglak Lee , Todd C. Hollon

Categories: Computer Vision | Machine Learning

2022-06-16

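Accurate intraoperative diagnosis is essential for providing safe and effective care during brain tumor surgery. Our standard-of-care diagnostic methods are time, resource, and labor intensive, which restricts access to optimal surgical treatments. To address these limitations, we propose an alternative workflow that combines stimulated Raman histology (SRH), a rapid optical imaging method, with deep learning-based automated interpretation of SRH images for intraoperative brain tumor diagnosis and real-time surgical decision support. Here, we present OpenSRH, the first public dataset of clinical SRH images, drawn from more than 300 brain tumor patients and more than 1300 unique whole-slide optical images. OpenSRH contains data from the most common brain tumor diagnoses, full pathologic annotations, whole-slide tumor segmentations, and raw and processed optical imaging data for end-to-end model development and validation. We provide a framework for patch-based whole-slide SRH classification and inference using weak (i.e., patient-level) diagnostic labels. Finally, we benchmark two computer vision tasks: multi-class histologic brain tumor classification and patch-based contrastive representation learning. We hope OpenSRH will facilitate the clinical translation of rapid optical imaging and ML-based surgical decision support in order to improve the access, safety, and efficacy of cancer surgery in the era of precision medicine. Dataset access, code, and benchmarks are available at opensrh.mlins.org.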


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

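Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.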
